Analyses of voice and glottographic signals in singing and speech

نویسنده

  • ANDREAS SELAMTZIS
چکیده

Recent advances in machine learning and time series analysis techniques have brought new perspectives to a great number of scientific fields. This thesis contributes applications of such techniques to voice analysis, in an attempt to extract information on the vibration of the vocal folds as such, as well as on the radiated acoustic signal. The data that was analyzed in this work are acoustic recordings, electroglottographic (EGG) signals and transnasal highspeed videoendoscopic images. The data analysis techniques are primarily based on clustering, i.e., grouping of data based on similarity, and sample entropy analysis, i.e., quantifying the degree of irregularity in a given signal. The experiments were conducted so as to provide data for different types of vibratory behaviors (or vibratory states) of the vocal folds. Clustering was used in order to categorize in an unsupervised fashion these different vibratory states, based solely on the electroglottographic signal, or the glottal area waveform, or both. Sample entropy was utilized as an indicator of instabilities, when subjects produced voiced sounds using irregular vibratory patterns, such as register breaks, intermittent diplophonia, and other types of irregularities. The prominent role of sound pressure level and fundamental frequency motivated further study of the relationship between them and the shape of the electroglottographic waveform. Graphical representations were created to visualize the relationship between different vibratory behaviors with fundamental frequency and sound pressure level. The EGG waveform shape was seen to depend strongly on sound pressure level and somewhat less on fundamental frequency. In very soft phonation, the almost sinusoidal waveform of the EGG suggests that studying the EGG using clusters may give a better representation compared to conventional time-domain metrics. The paradigm of the clustering was later applied in synchronous recordings of electroglottogram and glottal area waveforms in professional tenor singers. Different vibratory states were classified successfully using clustering, and the electroglottogram was seen to be as good as the glottal area waveform for such a classification task. The last part of this work concerns voices from subjects with organic dysphonia. A study was dedicated to investigate how vowel context (sustained versus excerpted from speech) can affect the power of quantitative acoustic measures to discriminate dysphonic subjects from controls. Two acoustic voice quality measures were used: the cepstral peak prominence (smoothed) and sample entropy. The cepstral peak prominence (smoothed) showed better discriminatory power with excerpted vowels, while sample entropy with sustained vowels. Additionally, it was found that sample entropy was strongly correlated with cepstral peak prominence (smoothed) and with the perceptual quality of breathiness.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, much of the non-speech signals maybe classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

متن کامل

A Book to Scare and Stop Singing A critique of Larynx as Instrument and the Challenges to Vocal Production

The production of scientific resources in the field of recognition and application of sound is one of the most urgent necessities for Iranian society of voice training, speech and singing. In a society where the voices of its voice actors and speakers are unfortunately often harsh, piercing and, artificial, and where these types of voices have become a trend, the necessity of research in this a...

متن کامل

Segmentation of Speech Signals in Template-based Speech to Singing Conversion

Singing voice synthesis has found numerous applications in the entertainment industry over the recent years. The template-based personalized singing voice synthesis method is a new method of generating high quality singing voice, which synthesizes the singing voice by means of conversion from the narrated lyrics of a song. In this synthesis method, template speaking and singing voices are first...

متن کامل

Glottal Closure Instant detection from a glottal shape estimate

The GCI detection is a common problem in voice analysis used for voice transformation and synthesis. The proposed innovative idea is to use a glottal shape estimate and a standard lips radiation model instead of the common pre-emphasis when computing the vocal-tract filter estimate. The time-derivative glottal source is then computed from the division in frequency of the speech spectrum by the ...

متن کامل

Vibrato in Singing Voice: The Link between Source-Filter and Sinusoidal Models

The application of inverse filtering techniques for high-quality singing voice analysis/synthesis is discussed. In the context of source-filter models, inverse filtering provides a noninvasive method to extract the voice source, and thus to study voice quality. Although this approach is widely used in speech synthesis, this is not the case in singing voice. Several studies have proved that inve...

متن کامل

Phonetic segmentation of singing voice using MIDI and parallel speech

When analyzing singing voice signal, it is required to know the boundaries of each phonetic unit in the singing voice samples. However, due to prolonged vowels in the singing voice, it is not easy to accurately align a singing voice with the phonetic sequence of its lyrics by conventional speech recognition approach. This paper proposes a solution for the phonetic annotation of the singing voic...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2018